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Method for writing data to a non-volatile memory embedded in an integrated circuit and 
corresponding circuit 

The invention relates to a method for writing data from a processor to a non-volatile 
5 memory embedded in an integrated circuit and to a corresponding integrated circuit 
particularly adapted to execute this method. 

Such integrated circuits are extensively used in portable devices, e.g. wireless modules, 
mobile phones, digital cordless devices or personal digital assistants (PDAs). In this 
10 market, size and power consumption are decisive factors, thus the small area factor is a 
primary request. Thus, the integration of a non-volatile memory with a memory control 
unit (MCU) of the baseband system seems to be an adequate choice to reduce the 
number of components and pinouts and to improve routability and thus reduced overall 
system cost. 

15 

On the other hand, there are several other aspects contributing to improve the 
performance/cost ratio of a baseband system on chip (SOC). 

One of the most effective characteristics is an appropriately chosen memory hierarchy. 
20 Distributing the memory resources through a hierarchy with different access capabilities 
and different costs and sizes results in an overall improvement of the system 
performance. 

Fig. 1 shows a typical, so-called Harvard Architecture. A bridge 1Q1 is connected over a 
25 32-bit bus 104 to advanced system bus (ASB) modules 105, a memory management unit 
(MMU) 106, a RAM 107 and a flash memory 102. An advanced RISC machine (ARM) 
103 is connected to the MMU 106. Over a 16-bit bus 108, the bridge 101 is connected 
to advanced peripheral bus (APB) modules 109. These can be universal serial bus 
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(USB) modules, for example. It is evident that the main costs in terms of area are in the 
memory hierarchy. So a smart choice of the sizes and types of memory, strictly 
depending on the performance required, is the base of design success. 

5 Since many applications require a certain layer of software for the microcontroller within 
a Harvard Architecture, a read only memory (ROM) is typically used for the instruction 
memory (IMEM), while for the data memory a volatile memory, e.g. a DRAM or 
SRAM, is mostly chosen. This provides the chance to use the bootcode memory space 
on the die also as instruction memory, thus saving space and increasing the robustness 
10 and the security of the system. The advantage is that an embedded ROM, being 
integrated with the baseline process, does not require special technology options. 

However, such a solution requires a stable software and is thus suitable only for final, 
high volume production. Since most applications need several software refinements, the 
15 ROM solution is not suitable for the IMEM. Thus an embedded flash memory used as 
IMEM could represent the best solution in terms of size and performance. 

Moreover, the microprocessor has to deal with a more complicated writing procedure, 
consisting in loading the data into the memory and then programming the cells. 

20 

If the system has been designed to use the flash memory 102 as instruction memory, the 
microprocessor 103 will perform most of the time read accesses to this flash memory 
102 whereas it will write into the RAM 107. 

25 Unfortunately, if the software requires to store some data permanently, a write operation 
to the flash memory 102 is required. In this case, the execution code for programming 
the data must not run from within the flash memory 102, because the flash memory 102 
is not available for simultaneous read access - the fetching of the instruction from the 
microprocessor - and program access. 



m • 
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Differently from a read/write/read sequence to a RAM, where the write operation is self 
reset, the same sequence of operations is not allowed for a flash memory because this 
sequence will cause a conflict. 

5 Flash memories need to be set in read or program mode. This is done by executing the 
proper commands through the software code. So a general sequence of commands 
which wants to perform the mentioned sequence will be: 

(1) Set the flash memory in program mode; 

(2) Write DATA-1 to be programmed to flash ADDRESS- 1; 
10 (3) Write DATA-2 to be programmed to flash ADDRESS-2. 

As soon as the instruction (1) is executed, the flash memory 102 is not available for 
reading anymore. So no other instruction could be fetched and executed. Moreover and 
more fatal, the flash memory 102 shown in Fig. 2 will stay in this mode until a new 
15 instruction will set it again in read mode. But because the processor 103 is not able to 
fetch instructions from the flash memory 102 anymore, this will never happen. This 
obviously creates a deadlock for the system. 

In Caceres et al. US patent 5 802 554, a method and a system for reducing memory 
20 access latency by providing fine grain direct access to flash memory concurrent with a 
block transfer there from is described. The memory management unit controls the 
content of the cache with data either from a first level of memory hierarchy or a second 
level of memory hierarchy. However this solution cannot use the flash memory as 
IMEM. 

25 

A first and main object of the invention is to provide a method for writing data from a 
processor to a non-volatile memory embedded in an integrated circuit and a 
corresponding integrated circuit, which method lets the processor "see" the non-volatile 
memory as instruction memory and as volatile memory. 
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• In other words, the invention consists in a method and a hardware implementation which 
hides the writing access to an embedded flash memory, deviating the data to the RAM 
through a direct memory access (DMA) approach, until all the data have been sent from 
the microprocessor. In this way, the microprocessor is able to continuously fetch 
5 instructions from the flash memory which works as instruction memory. Nevertheless, 
the flash memory is still available in read mode until it receives the final program 
command. 

With the invention, the microprocessor can use the embedded flash memory as 
10 instruction memory and can write into those parts/space of the same memory which has 
to be programmed. 

The architecture according to the invention can drastically reduce the software 
development time by embedding a mechanism which lets the microcontroller see the 
15 memory space simply as a normal volatile memory space. 

The problem is solved by a method for writing data from a processor to a non-volatile 
memory embedded in an integrated circuit, with the features according to an independent 
method claim and by an integrated circuit with the features according to an independent 
20 apparatus claim. 

The method for writing data from a processor to a non-volatile memory embedded in an 
integrated circuit according to the invention comprises the following steps: 

25 The data, which shall be written to the non-volatile memory, will be transferred at first to 
a volatile memory. 

As soon as the data have been transferred to the volatile memory, a wait signal will be 
send to the processor. 

Afterwards the data will be transferred from the volatile memory to the non-volatile 
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At last, the wait signal will be removed. 

The integrated circuit according to the invention comprises a processor, a volatile 
memory, a non-volatile memory and an interface connecting the processor to the volatile 
5 memory and the non-volatile memory to the volatile memory. The interface is equipped 

- to transfer data, which are to be written to the non-volatile memory, at first to the 
volatile memory, 

- to send as soon as the data have been transferred to the volatile memory, a wait signal 
to the processor, 

10 - to transfer the data from the volatile memory to the non-volatile memory, and at last 

- to remove the wait signal. 

Advantageous further developments of the invention arise from the characteristics 
indicated in the dependent claims. 

15 

In one embodiment of the invention, the data transfer is controlled by an interface. 

In an another embodiment of the invention, the non-volatile memory will be set into the 
write mode at the beginning of the data transfer from the volatile memory to the non- 
20 volatile memory. 

In a further embodiment of the invention, during the data transfer from the volatile 
memory to the non-volatile memory, the non-volatile memory will be set into the 
program mode. 

25 

In a still further embodiment of the invention, at the end of the data transfer from the 
volatile memory to the non-volatile memory, the non-volatile memory will be set in read 
mode. 
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Advantageously all of the data are transferred at first to the volatile memory. With this 
kind of block-wise data transfer, the whole system can be accelerated. 

In an another embodiment of the invention, the addresses corresponding to the data, 
5 which shall be written to the non-volatile memory, are stored intermediately. 

To avoid the risk of a deadlock, a wait signal will be send to the processor before the 
data are written to the volatile memory and will be removed after the data are written to 
the volatile memory. 

0 

In a further embodiment of the invention, the non-volatile memory of the integrated 
circuit is a flash memory and/or the volatile memory is a embedded RAM. 

Subsequently, the invention is further explained with the drawings showing in 



15 



Fig. 1 



an advanced RISC machine based subsystem with an 
embedded flash memory according to the state of the art; 



Fig. 2 



a block diagram of the communication between a flash memory and a 



20 



processor; 



Fig. 3 



a diagram with a sequence of instructions to avoid a 
deadlock of a first embodiment; 



25 Fig. 4 



a diagram with a sequence of instructions to avoid a 
deadlock of a second embodiment; 



Fig. 5a 



the first part of a flow diagram of the operations together with 
corresponding block diagrams of an integrated circuit according to the 
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the second part of the flow diagram shown in Fig. 5a; 



Fig. 6a 



the first part of a block diagram of an implementation of the integrated 
circuit according to the invention; 



5 



Fig. 6b 



the second part of the block diagram shown in Fig. 6a; 



Fig. 7 



a diagram of a main state machine according to the 
invention; 



10 



Fig. 8 



a behaviour description of an embedded direct memory access (DMA) 
engine according to the invention. 



A solution for the deadlock problem occurred in the architecture according to Fig. 1 is 
15 to download the portion of code which performs the program operation into the RAM 
107, then branch to the first instruction of this program routine in the RAM 107, execute 
the program operation from within the RAM 107 and finally, after the programming of 
the flash memory 102, jump again to the code in the flash memory 102. This is shown in 



The main limitation of this software approach lies in the difficulty to properly manage the 
code and the addresses. If the code is written in a high-level language, e.g. in C- 
language, the software engineer has to deal with massive use of pointers and function 
calls through pointers in order to execute the program routine not from within the flash 
25 memory 102, but within the RAM 1 07. 

Moreover, the program routine shown in Fig. 4 is using some variables stored in the 
flash memory address space, which is a common situation, whereby the compiler is not 
aware from where the routine will be executed. As soon as during the execution of the 
30 program routine such variables are called, an exception error will occur, because the 



Fig. 3. 



20 
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flash memory 102 is not accessible in read mode. 

Finally, to write the copy routine, which should copy the program routine from the flash 
memory 102 to the RAM 107, the software engineer has to know information about the 

5 program routine, like the start address, the end address and the size of the portion of 
code to transfer. But these are not available at the moment of the software writing. So 
the only possibility is to compile a first draft version of the complete code, then, from the 
assembly version of the compiled code, extract the required information, rewrite the 
code and recompile it. Any further modification on the code will change the relative 

10 addressing of the instructions and will need such an approach again for an iterative 
approximation. 

The following is an example of how to manage the dumping of the routine which must 
be executed from the RAM 107 instead from the flash memory 102 through a high level • 
1 5 language approach. A memcopy command will copy the routine in the RAM 1 07 and, 
through a proper pointer assignment, it will be possible to call the routine to be executed 
from this RAM address. Unfortunately this does not avoid the occurrence of errors if a 
variable calling will try to access to the flash memory address space. 

20 The memcopy command needs the start address and the size of the routine to be copied. 
So a first draft compilation has to be done to get the start address and the size of the 
routine, because these become available only after a compilation. Now the correct 
information can be inserted in the code and compiled again. Each modification in the 
code will change the addresses of the instructions in the compiled code, so the procedure 

25 described so far must be performed again. 



A less user-friendly but more accurate approach is to write part of the code directly in 

assembly language. The sequence: 

(1) copying the flash program routine in the RAM; 
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(3) jump back to the flash memory; 

is split into two routines instead of being managed through library commands, like 
memcopy, and function calls through pointers. To achieve the above mentioned 
sequence, in the main C-code, the following assembly routine is called: 
5 (a) assembly routine: 

(a. 1) copy the code from the flash memory to the RAM; 

(a.2) force the program counter of the processor to the first instruction address of the 
RAM where the routine has been copied. 
The flash program routine is now executed from the RAM 
10 (b) flash program routine running from RAM: 
(b. 1) program the flash memory; 

(b.2) call an other assembly routine to force the program counter of the processor to the 
first instruction address of the rest of the code in the flash memory address space. 

15 The execution from now on will continue from the flash memory. 

In this way, the routine executing from the RAM 107 will first program the flash 
memory 102, then call an assemble routine which forces the program counter in the flash 
memory address space, and then continue to execute the code, but from the flash 
20 memory 102. 

The advantage of this approach compared to the above-mentioned solution is the direct 
control of the addresses of the instruction, the compiler not being involved in the 
copying and in the jumps to and from the RAM 107 and so fewer possibilities of wrong 
25 attempts to access the flash memory 102 during the program operation. 

Also in this case, a first draft compilation is required at every modification of the code in 
order to know where to jump in the RAM 107 and where to jump back into the flash 
memory 102. 



30 
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Finally, for preventing wrong attempts to access the flash memory 102 during executing 
the code from the RAM 107, the safest way is to copy not only the program routine into 
the RAM 107, but also the rest of the code. The disadvantage of this approach is a high 
RAM area consumption. 

5 

An example of how to manage the dumping of the routine which must be executed from 
the RAM 107 instead from of the flash memory 102 through low level language 
approach is given below. 

10 (a) The C-main program will call the first assembly routine. 

(b) This will copy the interesting blocks of flash memory contents into the RAM 107 and 
then will force the program counter to the first instruction of the program now copied 
into the RAM 107. 

(c) From now on, the code will be executed from the RAM 107 and it is possible to 
1 5 program the flash memory 1 02. 

(d) When the program operation is finished, a second assembly routine will be called just 
for forcing the program counter to the next valid instruction of the C-code, but this time 
in the flash memory 102, which is now available again. As before, all information 
regarding where to jump into the RAM 107 and into the flash memory 102 becomes 

20 available only after a first draft compilation, and must be checked at every code 
modification. 



The target of the architecture described below is to provide a means to the software 
development which allows to program the non-volatile instruction memory without 
25 doing it from an other memory, but simply interfacing it as a RAM and just waiting the 
required program time. 

The following definitions will be used. An interface allows to write through a proper 
load [command f the data to be programmed in an embedded register of data latches. 
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itself will be started. Both operations require to set the flash memory into write mode. 
After setting the flash memory into write mode, these two commands could not be 
executed from within the flash memory. 

5 The invention is not restricted to any special kind of embedded flash memory, but may 
be employed with any non-volatile memory. 

The architecture consists in an interface which, during the load operation, will transfer 
the data to be programmed, rather than in the embedded register of the flash memory, in 

10 the RAM through a kind of direct memory access (DMA), keeping track of the 

addresses where this data must be written in the flash memory, but without setting the 
flash memory in write mode and so still keeping the flash memory available for reading 
the instructions. As soon as all data to be programmed have been sent from the 
microprocessor, the next command will be for the real programming of the flash 

15 memory. In this case, the interface will send a wait signal back to the microprocessor, so 
that the latter will not try to fetch instructions from the flash memory. After setting the 
flash memory in write mode, the interface will start the transfer from the RAM to the 
embedded register into the flash memory for the actual load operation. At the end of the 
load operation, the interface will send the program command to the flash memory and, at 

20 the end of the program operation, the interface will remove the wait signal from the 
microprocessor bus. 

From the processor's viewpoint, a simple write operation has been performed. The write 
operation lasts longer then the write operation for a RAM, due to the program time of 
25 the flash memory, but there are no special time-consuming tricks anymore required for 
the operations of copying/jumping/programming^umping . 



30 



The following describes the operations shown in the flow diagram of Figs. 5a and 5b 
together with the block diagram of Figs. 6a and 6b performed by the interface 605. In 
the flow diagram of Figs. 5a and 5b, six different sections are separated by doted lines. 
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Section 1: The microprocessor 603 is executing the instruction of a load request 
load/prog instruction and is writing the proper value in the flash controller register 
shadow fctr reg 608. From now on, the interface 605 is waiting for the data DATA to be 
loaded into the flash memory 602. 

5 

Section 2: The microprocessor 603 selects the flash memory 602 in write mode and 
sends the data DATA and addresses ADDR on the bus 604. The interface 605 will mask 
the write mode to the flash memory 602 and will catch the data DATA and the addresses 
ADDR. The interface 605 will also generate a wait signal wait for the microprocessor 
10 603 and will store the daX&DATA in the RAM 601. And the interface 605 will remove 
the wait signal wait for the microprocessor 603. The flash memory 602 is still not in 
write mode and the microprocessor 603 could still fetch instructions for example for 
loading the data DATA. 

1 5 Section 3 : Now the microprocessor 603 is executing the instruction of a program 
request load/prog instruction. It is writing the proper value in the flash controller 
register shadow fctr reg 608. After that the interface 605 will again generate a wait 
signal wait for the microprocessor 603. The interface 605 will set the flash memory 602 
in write mode while the microprocessor 603 is in wait state. The interface 605 will read 

20 the data DATA from the RAM 60 1 . 

Section 4: The interface 605 will load the data DA TA in the flash memory data latches. 

Section 5: After the last data DA TA has been loaded from the RAM 601 into the flash 
25 memory 602, the interface 605 will generate a program command for the flash memory 
602 and will poll the ready signal from the flash memory 602 which signal indicates that 
the programming of the flash memory 602 is finished. 

Section 6: When the flash memory 602 generates the ready signal, the interface 605 will 
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microprocessor 603. 

Below, the management of the addresses, both for the RAM 601 and for the flash 
memory 602 will be described. 

5 

The main issues to be solved are: 

1 . Where to temporary store the data DA TA in the RAM 60 1 . 

2. Keep the RAM address to enable retrieval of the data DATA from the RAM 601 
afterwards. 

10 3 . Keep information about where to program the DA TA in the flash memory 602. 

The first problem is managed by a direct memory access (DMA) approach embedded in 
the interface 605, whereas the second and third problems are solved by the embedded 
algorithm in the interface 605. 

15 

For managing the direct memory access (DMA), the interface 605 embeds a DMA 
engine 610. In general, if a DMA controller and a microprocessor share the signal wires 
to the memories, there must be a mechanism to arbitrate which device shall have access 
to the memories. In the here described embodiment of the invention, the interface 605 

20 will access the memories 601 and 602 only after putting the microprocessor 603 in wait 
state through the general wait signal wait The signal wait is always available on the 
system bus, so no other arbitration features are required. On the other hand, the 
microprocessor 603 must transmit the beginning address in RAM 601 to the DMA 
engine 610. With this information, the interface 605 is able to re-direct the data DATA 

25 sent by the microprocessor 603 to the RAM 601, store the RAM start address, and keep 
track of the number of words still to be sent. Moreover there is no risk of corrupting 
data in the RAM 601 writing in a wrong place. The purpose of the DMA engine 610 is 
to select the proper device, RAM 601 or flash memory 602, send or read data to or from 
this device, wait for a ready information from the device, and transmit the ready 

30 information as acknowledgement to the main state machine 617 in the interface 605. 
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In Figs. 6a, 6b, 7, and 8, the implementation of the interface 605 architecture is 
presented. In the block diagram shown in Figs. 6a and 6b, the interface 605, the 
microprocessor 603, and the RAM 601 are connected to the system bus 604 together 
with the control signals wait, nREAD/WRITE, the data signal DATA, and the address 
5 signal ADDR. Moreover the interface 605 needs to access the chip select of the flash 
memory 602 called SELJFLASH and the RAM 601 called SELJiAM. The interface 605 
is mainly based on the main state machine (MSM) 617, the DMA engine 610 and the 
counter 606. The main state machine 606 checks the requests from the microprocessor 
603 and controls the inputs for the addresses and data registers. The counter 606 keeps 
10 track of the number of words written in the RAM 601 in order to retrieve them for the 
flash memory loading. 

The registers are: 

The flash memory start address register 61 1 stores the starting address jZos/f start addr 
1 5 which indicates where to program the flash memory 602. 

The RAM start address register 612 stores the starting address RAM start addr, which 
indicates where to temporary- save the data DATA in the RAM 601 . This starting 
address must be send to the interface 605 from the microprocessor 603 before sending 
20 the data DATA. 

The data buffer 613 stores the data DA TA at every load cycle before sending the data 
DATA to the RAM 601 or flash memory 602. 

25 The last count register 607 stores the maximum number of RAM writing steps 

performed. This register is used to check the end of the successive flash memory loading 
through a comparator 616. 



The flash current address register 614 stores the address currently used by the DMA 
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- The flash memory start address flash start addr for the first loading; and 

- (The flash memory start address flash start addr) + (current counter value) during the 
successive loading cycles; 

5 The RAM current address register 615 stores the RAM address RAM current addr 
currently used by the DMA engine 610 for accessing the RAM 601 . This is: 

- The RAM start address RAM start addr for the first write or read operation; and 

- (The RAM start address RAM start addr) + (current counter value) during the 
successive writing or read operation; 

10 

The shadow flash control register shadow Jctr_reg 608 stores the load command and 
the program command sent by the microprocessor 603 and cached by the interface 605. 

The real flash control register real Jctr_reg 609 stores the load command and the 
15 program command to be sent by the interface 605 to the flash memory 602. 

Fig. 7 shows a representation of the behaviour of the main state machine (MSM) 617, 
i.e. the operations performed by the MSM and its 

co-functioning with the Direct Memory Access state machine (DMA SM) 610. 

20 

State 0: The MSM is in idle. 



State 1 : The MSM has detected a load request from the microprocessor (\xP in the 
following). The 603 has written the specified Load instruction into the shadow flash 

25 control register shadow fctr reg of the Interface 605. 

a) The MSM will set the enable signal enjdatajn and the signals update jxddr and 
datajnjsel to allow the transfer from the data bus 604 to the RAM^startjxddr register 
and RAMjmrrentjxddr. Now the \xP has to send to the MSM the starting address of 
the RAM space which will be used as buffer. The MSM will catch this value from the 

30 DATA bus and store it in the registers RAMjstartjzddr and RAMjmrrentjxddr. 
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Moreover, the MSM will reset the counter 606 used for the incremental address 
calculation. 

State 2: Now the \xP has to send to the MSM the starting address of the RAM space 
5 which will be used as buffer. The MSM will catch this value from the data bus and store 
it in the RAMjstartjxddr register. 

State 3 : The MSM detects that the jxP has selected the flash memory 602 in write mode 
(the detection is made through the check of the signals SELJFLASH and 
10 nREAD/WRTTE available at bus system level, see Fig. 6b). Moreover, the yP has sent 
DATA and ADDR onto the bus. 

a) the MSM will generate a wait signal WAIT for the jjP. 

b) the MSM will set the enable signals enjxddrjn and en_datajn and will catch from 
the DATA and the ADDR system bus. These are the values to be stored in the 

15 DATAJbuffer and FLASH star tjiddr registers. 

State 4: Now the MSM has to re-write to the RAM 601 the data catch. 

a) the address for the RAM is calculated as: v 

value of the register RAMjmrrent addr + value of the counter. 
20 For the first loop the RAM current _addr = RAM_startjxddr and the value of the 
counter = 0 (see state l.a) 

b) Now the MSM will send to the DMA SM the ram_w/te request with the address 
RAMjmrrentjxddr and the date, DATA buffer. 

25 State 5: The MSM waits for the acknowledgment from the DMA SM that the RAM 
write operation has been performed. 

State 6: When the MSM has detected the acknowledgement from the DMA SM, the 
signal WAIT for the juP may be removed. Note that the flash memory is not in write 
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Now the |jP will continue to send DATA and ADDR onto the system bus as if it were 
really loading into the flash memory the DATA to be programmed. The MSM will loop 
between states 4 to 6, deviating the DATA to the RAM and masking the load operation 
to the flash memory, which will stay in read mode. After the jiP has sent all the data to 
5 be loaded and performed or whatever other operation is desired, it will send the final 
program instruction Prog Instruction to the shadow_fctr_reg of the interface. 

State 7: The MSM has detected the Prog Instruction from the pP: 

a) The MSM will put the jaP in wait because from now on the flash memory will not be 
10 available in read mode and so will not deliver instructions to the |aP. 

b) The MSM will store the Flash current addr in the Last Count register. 

c) The MSM will transfer the Flash start jxddr and the RAMjstartaddr (stored in state 
3.b) to the Flash current addr and RAMcurrentaddr respectively. Moreover it will 
reset the counter. 

15 d) The MSM will send to the DMA the RAM read request with the RAM_currentjxddr 
as address and will wait for the DMA acknowledgment DMA Ack. 

State 8: The MSM will now loop until the DMA Ack from the DMA SM will be set (the 
RAMj-ead operation has been performed). The Interface will load the DATA into the 
20 flash memory data latches. 

State 9: The MSM will now perform a real load to the flash memory data latches with 
the DATA retrieved by the RAM and the address in the Flash current jaddr. This will 
be performed by the DMA SM through a flashjoad request by the MSM. Moreover 
25 the MSM will compare the Flash current jzddr with the Last count value (stored in 
state 7.b) to check if it is the last DATA to be loaded. 
If Flash current jxddr is equal to Last count, the MSM will go to 
state 11. 

b) If Flash _current_addr is different from Lastjcount, the MSM will go to state 10. 



30 
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State 10: The MSM will now increment the counter and will update the registers: 

a) Flash current_addr = Flash current addr + Counter, 

b) RAM_current_addr = RAMctirrent addr + Counter; 

c) The MSM will send to the DMA SM the ram_read request with RAM_current_addr 
as address and will go again to state 8 to wait for the DMA acknowledgment. 

State 1 1 : If Flash_current_addr = Last count all the data have been loaded into the 
flash memory and it is now time to program it, sending to the flash memory the real 
program command. 
10 

State 12: Now, depending on the flash memory architecture (this is independent from the 
flash memory module), the MSM will loop either a fixed number of clock cycles 
depending on the program time specified for the flash memory, or will loop until an 
potential ready signal coming from the flash memory at the end of the program time (if 
1 5 this signal exists, it could go through to the DMA SM and the MSM would wait for the 
DMA Ack from the DMA SM as usual). 

-State 13: Having received the DMA Ack from the DMA SM or reached the program 
time, the MSM can now bring the flash memory back in read mode (as for the load and 
20 program, this is done writing the proper bits into the Realjctr_reg). The MSM will 
now remove the signal wait for the uP and will go into an idle (state 0). 

Fig. 8 now shows the behaviour of the embedded Direct Memory Access (DMA) state 
machine 610 according to the invention. A more detailed description of the DMA SM 
25 Sanction follows. 

State 0: The DMA SM is in Idle. 



State 1 : The DMA SM has detected tLRM4_yvrite request from the MSM. 



PHCH020020 EP-P 

19- 



b) the DMA SM sets the signal addr_out_sel to send to the address system bus the value 
of the RAM current addr. 

the DMA. SM may now set the signals enable jxddr out and 

enable_data_out in order to send to the address and data system bus the value of the 
5 RAM_current_addr and the value of the DA TA BUFFER registers. 

d) Depending on the RAM architecture, the write operation could take a fixed number of 
clock cycles or be governed by a ready signal back from the RAM. In any case at the end 
of the write cycle the DMA SM will go to state 4. 

10 

State 2: The DMA SM has detected nRAM_read request from the MSM. 

a) the DMA SM resets the signal dmaack used by the MSM as acknowledgment. 

b) the DMA SM sets the signal addr_out_sel in order to send to the address system bus 
the value of the RAM_current_addr. 

1 5 c) the DMA SM can set now the signals enable jtddrjmt in order to send to the address 
system bus the value of the RAM_current_addr. 

d) Depending on the RAM architecture, the read operation could take a fixed number of 
clock cycles or be governed by a ready signal back from the RAM. In any case, the , 
DMA SM will go to state 4 at the end of the read cycle. 

20 

State 3: The DMA SM has detected a FLASH Joad request from the MSM. 

a) the DMA SM resets the signal dma_ack used by the MSM as acknowledgment. 

b) the DMA SM sets the signal addroutjsel in order to send to the address system bus 
the value of the FLA SH current addr. 

25 c) the DMA SM can now set the signals enable addr out and 

enable _data_out in order to send to the address and data system bus the value of the 
FLASH_current_addr and the value of the DATA_BUFFER registers, 
d) Depending on the flash memory architecture, the load operation could take a fixed 
number of clock cycles or be governed by a ready signal back from the flash memory. In 

30 any case, at the end of the load cycle, the DMA SM will go to state 4. 
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State 4: When the memory operation is concluded, the DMA SM sets the signal 
dma jxck to 1 for the MSM and returns to the idle, state 0. 

This concludes the description of the operation of both the main state machine (MSM) 
5 and the Direct Memory Access state machine (DMA SM). 

The architecture described has several advantages. The software development time is 
reduced and a robust system developed with a reusable application code, which focuses 
on the application and not on the limits of the hardware. 

10 

The method according to the invention with its architectural implementation is well 
suited for all applications in which a microprocessor uses an embedded flash memory as 
instruction memory. 

15 
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1. A method for writing data (DATA) from a processor (603) to a non-volatile memory 
(602) embedded in an integrated circuit, 

comprising the following steps 

(a) at least part of said data (DATA) to be written to said non-volatile memory (602) is 
5 transferred to a volatile memory (601), 

(b) when said data (DATA) has been transferred to said volatile memory (601), a wait 
signal (wait) is sent to said processor (603), 

(c) said data (DATA) is transferred from said volatile memory (601) to said non-volatile 
memory (602), 

10 (d) said wait signal (wait) is removed. 

2. The method for writing data according to claim 1, wherein 
the data transfer is controlled by an interface (605). 

15 3. The method for writing data according to claim lor 2, wherein 

at the beginning of the data transfer from the volatile memory (601) to the non-volatile 
memory (602), said non-volatile memory (602) is set in write mode. 

4. The method for writing data according to any of the preceding claims, wherein 
20 during the data transfer from the volatile memory (601) to the non-volatile memory 
(602), said non-volatile memory (602) is set in program mode. 
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5. The method for writing data according to any preceding claim, wherein 

at the end of the data transfer from the volatile memory (601) to the non-volatile 

memory (602), said non-volatile memory (602) is set in read mode. 

5 6. The method for writing data according to any preceding claim, wherein 
all of the data (DATA) is transferred first to the volatile memory (601). 

7. The method for writing data according to any preceding claim, wherein 

the addresses (ADDR) corresponding to the data (DATA) to be written to the non- 
10 volatile memory (602) are stored intermediately. 

8. The method for writing data according to any preceding claim, wherein 

before the data (DATA) is written to the volatile memory (601), the wait signal (wait) is 
sent to the processor (603) and is removed after said data (DATA) is completely written 
1 5 to said volatile memory (60 1). 

9. An integrated circuit with a processor (603), a volatile memory (601), a non-volatile 
memory (602), and an interface (605) connecting said processor (603) to said volatile 
memory (601), and said non-volatile memory (602) to said volatile memory (601), 

20 wherein 

said interface (605) is equipped: 

- to transfer data (DATA) to be written to said non-volatile memory 
(602) first to said volatile memory (601), 

- to send a wait signal (wait) to said processor (603) when said data (DATA) is 
25 transferred to said volatile memory (601), 

- to transfer said data (DATA) from said volatile memory (601) to 
said non-volatile memory (602), and 

- to remove said wait signal (wait). 
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10. The integrated circuit according to claim 9, wherein 
the non-volatile memory (602) is a flash memory. 

11. The integrated circuit according to claim 9 or 10, wherein 

the volatile memory (601) is a random access memory or a static random access 
memory. 

12. The integrated circuit according to claim 9, 10 or 1 1, wherein 
the volatile memory (601) is an embedded volatile memory. 
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Method for writing data to a non-volatile memory embedded in an integrated circuit and 
corresponding circuit 

The invention is a method and a circuit for writing data (DATA) from a processor (603) 
5 to a non-volatile memory (602) embedded in an integrated circuit. The main objective is 
to optimize the use of this embedded non-volatile memory. The method comprises a 
number of steps: The data to be written to the non-volatile memory (602) is first 
transferred to a volatile memory (601). Thereafter, a wait signal (wait) will be send to 
the processor (603). Then, the data (DATA) will be transferred from the volatile 
10 memory (601) to the non-volatile memory (602). At last, the wait signal (wait) will be 
removed. Thus the non-volatile memory (602) can be used both as instruction memory 
and as RAM, which achieves the main goal of this invention. The corresponding 
circuitry is a complex integrated circuit equipped to execute the above functions. 

15 Figs. 5a and 5b 
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